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Abstract 

-*— • 

c/3 • Robust design has been widely recognized as a leading method in 

reducing variability and improving quality. Most of the engineering 

^. , statistics literature mainly focuses on finding point estimates of the 

optimum operating conditions for robust design. Various procedures 



for calculating point estimates of the optimum operating conditions 



~f\ ' are considered. Although this point estimation procedure is important 

~i for continuous quality improvement, the immediate question is "how 

accurate are these optimum operating conditions?" The answer for this 
is to consider interval estimation for a single variable or joint confidence 
regions for multiple variables. 

In this paper, with the help of the bootstrap technique, we de- 
velop procedures for obtaining joint confidence regions for the opti- 
mum operating conditions. Two different procedures using Bonferroni 
and multivariate normal approximation are introduced. The proposed 
methods are illustrated and substantiated using a numerical example. 

Keyword: Bootstrap; quality improvement; robust design; opti- 
mization; response surface. 



1 Introduction 



Robust design has been widely recognized as a leading method in reduc- 
ing variability in the quality characteristic and improving quality. It is 
also recognized that quality improvement activities are most efficient and 
cost-effective when implemented during the design stage. Because of their 
practicability, robust design techniques have found increased applications in 
many manufacturing industries. Many industries are of great interest in the 
potential for applying robust design principles and are seeking a role in the 
information revolution with robust design at its core. 

The primary goal of robust design is to minimize variation in the qual- 
ity characteristic of interest while keeping a process me an at the customer- 
identified target value. In order to achieve this goal, iTaguchil (|1986l ) in- 



troduced a systematic method for applying experimental design, which has 
become known as robust design. Even though the ad hoc robust design 
methods suggested by Taguchi remain controversial due to various mathe- 
matical imperfections, there is no serious disagreement among engineering 
researchers and practitioners about his basic philosophy. The controversy 
surrounding Taguchi 's assumptions, his exp erimental design, and his exper - 



imental analysis h as be e n we 



Box et al 



1 addr essed by 



Leon et al. 



(|1988h . lNairl (J1992 ). and[Tsuj (J199J). Consequently, researchers 



(|l987h .lBoxl (|l988h . 



have closely examined alternatives using well-established statistical tools 
from tradition al theories of expe r iment al designs. In an early attempt of 



such research, 



Vining and Mverd (|1990l ) introduced the dual response ap- 



proach based on a response surface methodology (RSM) as a superior alter- 
native for modeling process relationships by separately estimating the re- 
sponse functions of the process mean and variance. The mean target value 



and variance of the target value are assumed to be polynomial functions 
of the various possible design points. Multiple experiments are conducted 
at the various design points in order to obtain estimates of the mean and 
variance of the target value at the various design points. Then, given these 
estimates, the coefficients for the response surface functions are estimated 
using standard regression techniques and the functions are assumed to hold 
continuously and therefo re for points between the design points considered 



during the experiments. 



Vining and Myers! (J1990J) then obtain the optimal 



design point (usually somewhere between those used during the experiment) 
by minimizing an objective function that penalizes mean bias and variance. 
Thus, it achieves the primary goal of robust design by minimizing the process 
variance while adjusting the proc ess mean at the target. 



However, 



Lin and Tul (|1995l ) pointed out that the robust design solu- 



tions obtained from the dual response model may not necessarily be optimal 
since this model forces the process mean to be located at the target value 
and proposed the mean-squared-error model, relaxing the zero-bias assump- 
tion. While allowing some process bias, the resulting process variance is 
less than or at most equal t o the variance obtained from the model pro- 



posed by 



Vining and Myers! (|l990l ): hence, the mean-squared-error model 



may provide better (or at least equal) robust design solutions if the zero- 
bias assumption is not required. The robust design approach to determin- 
ing orjtimimi_vahiejyias_b^ r esearchers, includ 



ing 



Borror and Montgomery! (12 000) , 



Scibilia et al. 



2003), 



Govindaluri et al. 



20041 



Lee et al. 



(|2007ah . and 



2003 ) 



Lee et al 



Kim et al. 



(2007 



b. 



(2005). 



Lee and Park ( 



200J), 



Lee et al. 



As afore-mentioned, the majority of the research on robust design mod- 
elling have focused on methodological development of models for optimizing 



operating conditions. Although this is critically important for continuous 
quality improvement, the immediate question is "how accurate are these 
optimum operating conditions?" To the best of our knowledge, the answer 
for this question has not been properly addressed in industrial applications 
of dual response surface methods. We can consider interval estimation for a 
single variable or joint confidence regions for multiple variables as an answer 
for this question. 

With the help of the bootstrap technique, we develop procedures for ob- 
taining joint confidence regions for the optimum operating conditions. The 
single response surface approach uses the method of least squares to obtain 
the adequate single response functions, while the squared-loss approach uses 
two surfaces for process mean and variance. For the single response surface 
approach, the variance-covariance matrix of the regression coefficients can be 
easily obtained, so the derivation of the joint confid ence region for optimum 



condit ions is straightforward. For more details, see 



Myers and Montgomery 



(2002). With two response functions, however, it is quite difficult to obtain 
the variance-covariance matrix of the regression coefficients for both the 
mean and variance responses, particularly when the sample size is not large. 
This difficulty can be overcome by using bootstrap techniques. In an era 
of powerful computers, computer-intensive methods such as the bootstrap 
technique promise to be one of the mainstays of applied response surface 
methodology and engineering statistics in the years ahead. 

This paper is organized as follows. The basics of robust design are 
introduced in Section [21 The bootstrap approach to determining joint con- 
fidence region for the optimum conditions is described in Section [3l followed 
by a case study in Section U] and the paper ends with concluding remarks in 
Section [3 



2 Robust design based on response surface 

We consider a system with a response Y. This response depends on the 
levels of k control factors x = (x\,X2, ■ ■ ■ , x&). The following assumptions 
are generally made: 

(i) The response Y depends on x. Thus, Y can be viewed as a function 
of x, that is, Y = F(x\, X2, • • • , £&)■ But, the functional structure F(-) 
is either unknown or very complicated. 

(ii) The levels of the control factor X{ for % = 1,2, ... ,k are continuously 
quantitative. 

(hi) The levels of the control factor X{ for i = 1,2, ... ,k can be controlled 
by the experimenter. 



Following 



Vining and Mvera (|1990l ). the mean and variance response 



functions (surfaces) can be written as 

k k 



M(x) = /3 + ^2 PiXi + ^ Pii x i + 5Z &iJ X i X 3 + £r 



1=1 



i=l 



i<j 



and 



V(x) = 7o + ^ 7^ + 5Z 7iiX ^ + 5Z Tii^i^j + £v, 



8=1 



i=l 



respectively, where x = (a?i, X2, ■ ■ ■ , x^). To find the fitted response surfaces 
given above, we must regress the mean response M(x) and the variance re- 
sponse V(x) on the control factors, x±, X2, • • • , x^. Hence, we must estimate 
the mean and variance responses at each design point. The most popular 
estimation method is to find the maximum likelihood estimates, assuming 
that the error variables e m and e v are normally distributed. Suppose that 



n replicates are taken at the ith design point. Let Yij represent the jth 
response at the ith design point where i = 1,2, ... ,k and j = 1, 2, . . . , n. 
The most popular estimators of the location and scale parameters are mean 
and variance, respectively. The maximum likelihood estimates of the mean 
and variance at the ith design point are the sample mean and variance as 
shown below. 

1 n 1 n 

^ = -Y J Y v and sf = — -Vo^-F,) 2 . 

n ^-^ n — 1 ^-^ 

i=i i=i 

Let M(x) and V^(x) represent the fitted response functions for the mean 
and variance of the response Y, respectively. Assuming a second-order poly- 
nomial model for the response functions, we get 

k k k 

M(x) = $q + J2 fax + J2J2 &i XiX i W 

i=l i=l j=i 

and 

k k k 

V(yi) = 7o + J2 liXi + ^2 ^2 %XiXj . (2) 

j=l i=l j=i 

We use the sample mean and variance of Y to estimate the process mean 
M(x) and variance V"(x), respectively. 

The main goal of robust design is to obtain the optimum operating 
conditions of control factors, x\,X2, ■ ■ ■ ,Xk, and this can be achieved by 
employing the following squared-loss optimization model: 

minimize |M(x) — To} + V(x) 

subject to 

Xj € [Lj,Uj] for j = 1,2, ...,k, 

where To is the customer-identified target value for the quality characteristic 
of interest, and the constraint specifies the feasible joint region of operating 



covariates given by x = (x\,X2, ■ ■ ■ ,Xk)- When factorial designs with k 
levels are used, the constraint becomes Xj € [Lj, Uj] for j = 1, 2, . . . , k. The 
control factors (xj) solving the optimization problem above are the optimal 
design point estimates. 

When considering the process variance on the left hand side of the 
regression model in ([2]), one often uses the log-transformed values of the 
sample variances, i.e., log(Sf), since a linear model for the variance pro- 
cess does not guarantee that t he predicted values are always positive; see 
Myers and Montgomery! (|2002l ). Using the log-transform, we can avoid the 
problem of the negative estimates in the variance process. After estimating 
the logarithm of the process variance, VJ og (x), we can obtain the optimal 
operating conditions by minimizing 



subject to 



It is not eworthy that the 
proposed by 



{M(x)-T } 2 + exp(Vi og (x)) 

Xj G [Lj, Uj] for j = 1, 2, . . . , k. 

"ollowing dual-response optimization model 



Vining and Myers 



(J1990I ) can also be used for optimization 



purposes: 



subject to 



minimize V"(x) 



M(x) = T and Xj £ [Lj,Uj] for j = 1,2, 



A:. 



However, the dual-response model strictly imposes a zero-bias condition 
while the squared-loss model allows some bias (i.e., absolute value of the 
difference of M(x) and To). This squared-loss model often results in less 



variability, and will be the focus of this paper. For detailed informati on 
regarding the squared-loss model, readers may refer to lLin and Tul ()1995l ). 



3 Joint Confidence Region using Bootstrapped 
Samples 



The bootstrap technique was first developed by lEfronl (]1979l ) and this 
method has become one of the most popular computer-intensive statistical 
methods. The technique is simple yet powerful. The key idea is to retake 
samples from the original data in order to create re-sampled data sets from 
which the variability of the quantities of interest can be assessed without 
long and error-prone analytical calculations. 

As afore-mentioned, unlike the single response surface method, it is 
quite difficult to obtain the theoretical variance-covariance matrix of the 
regression coefficients for both the mean and variance responses when two 
response functions are considered. The alternative is to calculate the statis- 
tic of interest from simulated data sets using the Bootstrap re-sampling 
technique. We denote such a simulated data set as Y*,. It is standard to 
let the superscript notation (*) denote a bootstrapped or re-sampled value. 
The statistic of interest (optimum operating conditions) is calculated with a 
simulated data set. By simulating B times, we obtain B simulated optimum 
operating conditions. Using these conditions, we can obtain the joint con- 
fidence region of the optimum operating conditions of control factors. The 
bootstrap algorithm is as follows: 

For 6=1,2,..., B: 

1 Draw Yft, Y£, . . . , Y* n with replacement from Yn,Yi2, • • • , Y; m for i = 



1,2,...,*. 

2 Find the optimum operating conditions using the above data set. We 
denote this as (x^^x^). 

3 Repeat the above two steps for b = 1, 2, . . . , B. 

Then we obtain B simulated optimum operating conditions (x° c £ , i^T) for 
b = 1, 2, . . . , B. There are several ways of finding the joint confidence regions 
using boot strap ped samples. For more de tails, the reader is referred to 



Halll (jl992l ) and iDavison and Hinklevl (| 19971 ) . Here we briefly introduce two 



methods: 

(i) The construction of a rectangular region using a Bonferroni argument. 

(ii) The construction of an elliptical contour based on an approximation 
to the multivariate normal distribution. 

In the following section, we describe these two methods using a numerical 
example with multi- filament microfiber tows. 

4 A Case Study 

A company produces multi-filament microfiber tows. We conduct an ex- 
periment to determine the quality effect of control covariates. For such 
products, the key control factors are polymer temperature {x\^) and poly- 
mer feeding speed (x2i)- The diameter (Y) of the microfiber is the most 
important quality issue and its nominal target value is To = 50 microns. 
The 3x3 factorial design taken at each design point for i = 1, 2, . . . , 9 and 
j = 1, 2, . . . , 10 is shown in Table [TJ Here, the variables xn and X2,i are 
centered and re-scaled from the natural variables so that x\ j and X2 i are in 



[—1, 1]. We then obtain the estimate of the mean response, M(x), and the 
estimate of the log-transformed variance response, Vi og (x), as follows. 

M(x) = 51.741 + 7.750xi + 8.053x 2 + 20.262x? + 19.939x^ - 0.038xix 2 . 
Vi og (x) = 0.841 - 0.015x1 - 0.068x2 + 0.620x? + 0A21x 2 2 - 0.339xix 2 . 

By minimizing 

{M(x)-50} 2 + exp(Viog(x)) 

subject to 

|xi| < 1 and |x 2 | < 1, 

the optimum operating conditions are obtained as 

x oc = (x?,x?) = (-0.168,-0.179). 

Two methods, the construction of a rectangular confidence region us- 
ing a Bonferroni argument and the construction of an elliptical confidence 
region based on a multivariate normal distribution, are used to find the joint 
confidence region for the optimum operating conditions. For each bootstrap 
data set Y* where i = 1,2, ... ,9 and j = 1,2, . . . , 10, we obtain the following 
simulated estimates of the optimum conditions as shown in Table [2j 

A Bonferroni argument can be used to find the rectangular confidence 
region. Suppose that 9 is d-dimensional and the joint confidence region 
C a = (C?\C% 2 ,. . . ,C* d ) is rectangular, with the interval Cf = (9^,9^) 
for the ith component, where % = 1, 2, . . . , d. Then we have 

d d d 

i=l i=l i=l 

If we take on = a/d, then the region C a covers at least 100(1 — a)%. In 
our particular example, the region is two-dimensional, so we have d = 2. If 

10 



Table 1: Data for case study example. 



i 


Xl,i 


X2,i 






Y 






Y t 


&i 


~T~ 


-1 


-1 


73.94 


76.09 


73.39 


79.82 


76.47 












73.43 


76.89 


77.55 


77.12 


74.79 


75.949 


4.263 


2 





-1 


67.30 


64.55 


62.08 


58.18 


66.36 












63.49 


63.56 


65.91 


65.61 


65.05 


64.209 


6.853 


3 


1 


-1 


94.03 


93.67 


91.80 


86.34 


93.24 












91.45 


91.19 


87.71 


90.33 


92.71 


91.247 


6.390 


4 


-1 





66.93 


63.35 


64.55 


63.47 


60.23 












62.58 


62.63 


63.45 


66.29 


65.47 


63.895 


3.922 


5 








51.23 


51.03 


53.16 


52.84 


50.06 












50.02 


52.42 


53.32 


51.35 


53.57 


51.900 


1.775 


6 


1 





80.58 


78.10 


80.44 


76.83 


83.11 












84.45 


78.70 


77.04 


81.00 


79.27 


79.952 


6.181 


7 


-1 


1 


97.95 


91.50 


93.42 


91.67 


89.63 












92.10 


86.82 


95.48 


92.01 


97.35 


92.793 


11.631 


8 





1 


80.76 


77.86 


81.10 


77.31 


76.53 












80.31 


78.51 


79.60 


79.78 


78.16 


78.992 


2.377 


9 


1 


1 


106.10 


107.24 


110.72 


103.57 


109.17 












108.48 


110.41 


106.80 


108.58 


108.31 


107.938 


4.495 



we want to find the 90% joint confidence region, we can use a\ = 0.05 and 
OL2 = 0.05. To obtain the joint confidence, we need to find the confidence 
interval C"* for i = 1, 2. That is, we find the 95% confidence intervals of x° c b 
and X2%, separately. First, we will find the interval of x° c b . For convenience, 
we denote x° c b as 6 and the estimator of x^ c b as T. The realization of the 
estimator T is denoted by t, and its bootstrap simulated values are denoted 
by t*. To find the confidence interval, we need to estimate quantiles for T — 9 
and these are approximated using the bootstrap quantile of t* — t. The pth 
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Table 2: Simulated estimates of the optimum conditions from bootstrap 
data sets. 



1 2 3 4 5 6 ••• 999 



1,6 



-0.268 -0.254 -0.154 -0.079 -0.207 -0.202 ••• -0.169 
xf£ -0.141 -0.089 -0.207 -0.146 -0.207 -0.197 ■■■ -0.196 
M(xg c *) 50.024 50.060 50.255 50.036 50.692 50.399 ••• 50.503 



quantile of T — 6 is estimated by the (B + l)pth ordered va 



Davison and Hinklev 



ue of t* — t, that 



(|1997l ). an equi- 



is t?, B+1 \ | — t. Then, as described in 

tailed 100(1 — a\)% confidence interval will have the following endpoints: 

t ~ (%B+l)(l-ai/2)] ~ *) ancl t ~ (*f(B+l)ai/2] ~ *)• 

In this particular bootstrap simulation, we used B = 999. Thus, we have 

%B+l)(l-ai/2)] = *f975] and **(B+1)(qi/2)] = **25] ■ 

Similarly, we can find the confidence interval for x^V Although the 
Bonferroni argument has a long history in statistics, it is well known that 
the Bonferroni joint confidence region is wider than what they should be at 
a given confidence level and it is therefore conservative. Also, with plausible 
likelihood function contours, a circular or elliptic shape could be more appro- 
priate than a rectangular shape. One possible simple remedy for these defi- 
ciencies of the Bonferroni method is to use the classical normal-distribution 
approximation method. In our example, the region is two-dimensional, so a 
joint confidence region is an ellipse. If the d-dimensional estimator T of 6 is 
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Figure 1: Scatter plot of bootstrap results of the optimum operation condi- 
tions. 



approximately multivariate normal, then it is well known that the quadratic 
quantity Q shown below has an approximate x d distribution 

Q(8) = (T-ey±~\T-o), 

where S is the estimated variance-covariance matrix of T. 

If Q has exact chi-square quantiles Xd(p)i then a 100(1 — a)% joint 
confidence region for is 

{e:Q(0)< Xd (l-a)}. 

However, such chi-square quantiles are often unreliable, so it makes sense to 
use the bootstrap approximation of the distribution of Q to find quantiles 
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Figure 2: Blowup plot of Figure [T] with marginal histograms and kernel 
densities. 



of Q. The bootstrap analogue of Q is 

Q*(t) 

which is calculated for each of the B bootstrap samples and its calculation is 



(T* -t)'£* 1 (T* -t), 



denoted by q* . If we denote the ordered bootstrap values by qt, , <jrjL , . . . , q? B -, 
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then 100(1 — a)% bootstrap joint confidence region is given by 

{0:(t-oy±- 1 (t-o)< q * [{B+1){1 _ a)] }. 



In the calculation of Q*, we need to find the variance estimator 5] . One 
general way to obtain the value for 5] is to calculate 

t* = - — ^2(t** - t**)(t** - 1**)', 

where t** = (1//) J2k=i *fc* an< ^ ***> *2*> •••>*/* are calculated by bootstrap 
re-sampling from the bootstrap sample for each of the B simulated samples. 
Since t** are obtained by using bootstrap re-sampling of the bootstrapped 
sample, the double superscript notation (**) is used. Typically, I is in the 
range between 50 and 200. In this example, we set I = 100. For convenience, 
we denote 

= {xf^f)' and T = (£f ,xf)'. 

Figures [T] and [2] show the scatter plots of B = 999 pairs of the optimum 
conditions based on the bootstrap samples. Figure [2] is the blowup plot of 
Figure Q] with the marginal histograms and kernel density estimates of £° c fe * 
and x^ ■ The values of the optimum conditions from the original sample is 
superimposed on the plot. The vertical dotted line is x° c = —0.168 and the 
horizontal dotted line is x^ = —0.179. 

The rectangular region is the 90% joint confidence region based on Bon- 
ferroni simultaneous confidence intervals, while the 90% confidence elliptic 
region is based on the multivariate normal approximation. The bootstrap 
biases defined as 



B 
B 



— V^ -f OC - r 



6=1 
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49.0 



49.5 



50.0 



50.5 



51.0 



51.5 



52.0 



Figure 3: Histogram of the bootstrap mean response values. 

for i = 1,2 are 0.00208 and 0.00411, respectively. It is noteworthy that 
the rectangular confidence region deviates somewhat from the main body of 
the scatter plot downward and toward the left. This deviation comes from 
the bootstrap bias and skewness of the first and second components of the 
optimum operating conditions. 

We can also find the confidence interval of the mean response at the 
optimum conditions. Using this bootstrap technique, we can find the boot- 
strap estimates of the mean response at the optimum, that is, M(x£ c *). 
The mean response estimate at the optimum from the original sample is 
50.30. Using the simulated bootstrap samples, we can find the 90% boot- 
strap confidence interval, (49.36,50.59). Figure [3] also shows the histogram 
of bootstrap mean response values, where the dotted vertical lines are the 
end points of the confidence interval and the vertical dashed line is the 



16 



mean response estimate at the optimum from the original sample. The bias 
of the bootstrap estimate is 0.17. Because of this bias, we observe that the 
confidence interval moves toward the left. 

5 Concluding remarks 

In this paper, we developed methods of constructing joint confidence re- 
gions for optimum operating conditions using the bootstrap technique. Two 
different procedures were developed: A Bonferroni type procedure that con- 
structed a rectangular region and multivariate normal approximation pro- 
cedure that constructed an elliptical region. The proposed methods were 
illustrated and substantiated using a numerical example involving multi- 
filament microfiber tows. 
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